60 research outputs found

    14 Examples of How LLMs Can Transform Materials Science and Chemistry: A Reflection on a Large Language Model Hackathon

    Chemistry and materials science are complex. Recently, there have been great successes in addressing this complexity using data-driven or computational techniques. Yet, the necessity of input structured in very specific forms and the fact that there is an ever-growing number of tools create usability and accessibility challenges. Coupled with the reality that much data in these disciplines is unstructured, the effectiveness of these tools is limited. Motivated by recent works that indicated that large language models (LLMs) might help address some of these issues, we organized a hackathon event on the applications of LLMs in chemistry, materials science, and beyond. This article chronicles the projects built as part of this hackathon. Participants employed LLMs for various applications, including predicting properties of molecules and materials, designing novel interfaces for tools, extracting knowledge from unstructured data, and developing new educational applications. The diverse topics and the fact that working prototypes could be generated in less than two days highlight that LLMs will profoundly impact the future of our fields. The rich collection of ideas and projects also indicates that the applications of LLMs are not limited to materials science and chemistry but offer potential benefits to a wide range of scientific disciplines.

    Theory and application of medium to high throughput prediction method techniques for asymmetric catalyst design

    With the use of computational methods in the field of drug design becoming ever more prevalent, there is pressure to port these technologies to other fields. One of the fields ripe for application of computational drug design techniques, specifically virtual screening and computer-aided molecular design, is the design and synthesis of asymmetric catalysts. Such methods could either guide the selection of the optimal catalyst(s) for a given reaction and a given substrate or provide an enriched selection of highly efficient asymmetric catalysts, enabling synthetic chemists to focus on the most promising candidates. This would in turn save time and reduce the costs associated with the synthesis and evaluation of large libraries of molecules. However, to be applicable to the evaluation of a large number of potential catalysts, speed is of utmost importance. This impetus has led to the development of medium to high throughput virtual screening (HTVS) methods for asymmetric catalyst development or assessment, although very few applications have been reported. These methods typically fall into four classes: methods combining quantum mechanics and molecular mechanics (QM/MM), pure molecular mechanics-based methods (a class which can be subdivided into static and dynamic transition state modeling), and lastly quantitative structure selectivity relationship methods (QSSR). This review will cover specific methods within these classes and their application to selected reactions. Peer reviewed: Yes. NRC publication: Yes.

    Single-Point Mutation with a Rotamer Library Toolkit: Toward Protein Engineering

    Protein engineers have long been hard at work to harness biocatalysts as a natural source of regio-, stereo-, and chemoselectivity in order to carry out chemistry (reactions and/or substrates) not previously achieved with these enzymes. The extreme labor demands and exponential number of mutation combinations have induced computational advances in this domain. The first step in our virtual approach is to predict the correct conformations upon mutation of residues (i.e., rebuilding side chains). For this purpose, we opted for a combination of molecular mechanics and statistical data. In this work, we have developed automated computational tools to extract protein structural information and created conformational libraries for each amino acid dependent on a variable number of parameters (e.g., resolution, flexibility, secondary structure). We have also developed the necessary tool to apply the mutation and optimize the conformation accordingly. For side-chain conformation prediction, we obtained overall average root-mean-square deviations (RMSDs) of 0.91 and 1.01 Å for the 18 flexible natural amino acids within two distinct sets of over 3000 and 1500 side-chain residues, respectively. The commonly used dihedral angle differences were also evaluated and performed as well as the state of the art. These two metrics are also compared. Furthermore, we generated a family-specific library for kinases that produced an average 2% lower RMSD upon side-chain reconstruction and a residue-specific library that yielded a 17% improvement. Ultimately, since our protein engineering outlook involves using our docking software, Fitted/Impacts, we applied our mutation protocol to a benchmarked data set for self- and cross-docking. Our side-chain reconstruction does not hinder our docking software, demonstrating differences in pose prediction accuracy of approximately 2% (RMSD cutoff metric) for a set of over 200 protein/ligand structures. Similarly, when docking to a set of over 100 kinases, side-chain reconstruction (using both general and biased conformation libraries) had a minimal detrimental effect on the docking accuracy.
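
    For readers unfamiliar with the RMSD metric quoted in the abstract above, the following is a minimal sketch of how a side-chain RMSD between a predicted and a reference conformation could be computed. It assumes both conformations are given as equally ordered lists of (x, y, z) atom coordinates in ångströms and that no superposition is needed because the backbone is shared. The function name and the sample coordinates are hypothetical and are not taken from the paper or its software.

```python
import math


def side_chain_rmsd(predicted, reference):
    """Root-mean-square deviation between two equally ordered atom lists.

    Each argument is a list of (x, y, z) tuples in angstroms. The atoms are
    assumed to be matched by position in the list (an assumption of this
    sketch); no structural superposition is performed.
    """
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared_sum = 0.0
    for (xp, yp, zp), (xr, yr, zr) in zip(predicted, reference):
        squared_sum += (xp - xr) ** 2 + (yp - yr) ** 2 + (zp - zr) ** 2
    return math.sqrt(squared_sum / len(predicted))


# Hypothetical two-atom example: one atom matches, one is displaced by 2 A.
example_predicted = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
example_reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
example_rmsd = side_chain_rmsd(example_predicted, example_reference)
```

    Averaging such per-residue values over a test set would give an overall figure comparable in kind to those reported, although the exact protocol used by the authors is not described in this listing.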

    Customizable Generation of Synthetically Accessible, Local Chemical Subspaces

    Screening large libraries of chemicals has been an efficient strategy to discover bioactive compounds; however, part of the potential for success is limited by the available libraries. Synergizing combinatorial and computational chemistries has emerged as a time-efficient strategy to explore the chemical space more widely. Ideally, streamlining the evaluation process for larger, feasible chemical libraries would become commonplace. Thus, combinatorial tools and, for example, docking methods would be integrated to identify novel bioactive entities. The idea is simple in nature, but much more complex in practice; combinatorial chemistry is more than the coupling of chemicals into products: synthetic feasibility includes chemoselectivity, stereoselectivity, protecting group chemistry, and chemical availability, which must all be considered for combinatorial library design. In addition, intuitive interfaces and simple user manipulation are key for optimal use of such tools by organic chemists, which is crucial for the integration of such software in medicinal chemistry laboratories. We present herein Finders and React2D, integrated into the Virtual Chemist platform, a modular software suite. This approach enhances virtual combinatorial chemistry by identifying available chemicals compatible with a user-defined chemical transformation and by carrying out the reaction leading to libraries of realistic, synthetically accessible chemicals, all with a completely automated, black-box, and efficient design. We demonstrate its utility by generating ∼40 million synthetically accessible, stereochemically accurate compounds from a single library of 100 000 purchasable molecules and 56 well-characterized chemical reactions.

    The Second CACHE Challenge - Targeting the RNA-Binding Pocket of the SARS-CoV2 Nonstructural Protein 13 via a consensus-scoring method and FITTED templated docking.

    Disrupting the Nonstructural Protein 13 (NSP13) in SARS-CoV2 could provide a great avenue for the treatment of COVID-19 and help reduce its enormous health burden. As part of the second CACHE challenge, we targeted each of two sub-pockets of the NSP13 RNA-binding site via a multi-pronged virtual screening (VS) campaign, using the latest functionality in FITTED, our docking program, part of the FORECASTER drug discovery suite. After extensive structure preparation and docking (rigid, flexible), we evaluated predicted poses from the VS using four approaches: docking score, machine learning (graph neural network), quantum mechanics, and visualization, with the final selection being based on the consensus of all four approaches. Additionally, we implemented templated docking within FITTED to take advantage of fragments co-crystallized with NSP13, which supplemented our consensus selection. We now await the experimental testing of our predictions by the Structural Genomics Consortium, and once available, we will update this manuscript accordingly. In sharing our approach and findings, we hope to continue contributing to open science, and engaging in the ongoing effort of the scientific community towards ending COVID-19.
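
    The abstract above does not say how the four evaluation approaches were combined. As a rough illustration only, one simple consensus rule is to keep the compounds that every approach independently shortlists. The sketch below assumes each approach yields a collection of compound identifiers. The function name, the approach labels and the identifiers are hypothetical and do not come from FITTED or FORECASTER.

```python
def consensus_selection(shortlists):
    """Return the compound IDs present in every shortlist, sorted.

    `shortlists` maps an approach label to an iterable of compound IDs.
    Strict intersection is an assumption of this sketch; the selection rule
    actually used in the CACHE submission is not given in the abstract.
    """
    id_sets = [set(ids) for ids in shortlists.values()]
    if not id_sets:
        return []
    return sorted(set.intersection(*id_sets))


# Hypothetical shortlists from the four approaches named in the abstract.
example_shortlists = {
    "docking_score": ["cmpd_01", "cmpd_02", "cmpd_03", "cmpd_05"],
    "graph_neural_network": ["cmpd_02", "cmpd_03", "cmpd_04"],
    "quantum_mechanics": ["cmpd_03", "cmpd_02", "cmpd_06"],
    "visual_inspection": ["cmpd_02", "cmpd_03", "cmpd_07"],
}
example_consensus = consensus_selection(example_shortlists)
```

    A softer rule, such as requiring agreement from three of the four approaches, would return more candidates at the cost of a weaker consensus.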

    Fluoride-Mediated Desulfonylative Intramolecular Cyclization to Fused and Bridged Bicyclic Compounds: A Complex Mechanism

    We previously reported the synthesis of polysubstituted chiral oxazepanes in three steps from commercially available starting materials. The unexpected reaction of one of these 1,4-oxazepanes in the presence of TBAF provided a 4-oxa-1-azabicyclo[4.1.0]heptane core. This unusual process significantly increased the complexity of the molecular scaffold by introducing a bicyclic core. Surprisingly, the generated bicyclic structure featuring three stereocenters was a mixture of enantiomers with no other diastereomers observed. These striking experimental observations deserved further investigations. A combination of experimental and computational investigations unveiled a complex diastereoselective mechanism. A mechanistic rationale is presented for this observed rearrangement.